oddwarg();

Exceptions for GNU C

This project allows GNU C programs to use exception handling much like that of higher-level languages. This is achieved through a combination of setjmp/longjmp, thread-local state, statement expressions, nested functions, and cursed macros.

Download the code

Features:

  • Gracefully handles return statements in try constructs
  • Cleanup statements can be queued up inside a try block using finalisers
  • Catch blocks match exception types using bitfield masks
  • Curly braces are optional for single-statement try/catch blocks
  • Designed for use in multithreaded programs
  • Exceptions know the function, file and line number where they occurred
  • Exceptions can be chained, adding contextual information
  • Uncaught exceptions are printed and abort the program
  • Works on Windows 98 and Linux

Caveats:

  • Susceptible to the local variable clobbering problem.
    Local variables declared before a try block in the same function should be declared volatile if they are modified inside the try block and read after it.
  • Exceptions do not have complete call stack traces.
    However, additional call stack elements can be added using chaining.

Demonstration:

Throwing an exception jumps out to the nearest matching catch block.

try {
    throw(1, "First");
    println("Not reached");
} catch(1) {
    exceptionPrint();
}

Result:

Exception: 0x00000001 First
at main(Demo.c:13))

Exceptions have flags which can be filtered with a catch mask.


try {
    throw(2, "Second");
    unreachable;
} catch(1) {
    unreachable;
} catch(2) {
    exceptionPrint();
}

Result:

Exception: 0x00000002 Second
at main(Demo.c:23))
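As a rough illustration of mask-based matching (a sketch, not the library's actual code): assume a catch matches when the exception's flags and the catch mask share at least one bit. The flag names and values below are hypothetical and chosen only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, chosen only for this illustration. */
enum {
    DEMO_EX_IO     = 1u << 0,
    DEMO_EX_PARSE  = 1u << 1,
    DEMO_EX_MEMORY = 1u << 2,
    /* A mask covering several kinds of exception at once. */
    DEMO_EX_INPUT  = DEMO_EX_IO | DEMO_EX_PARSE
};

/* Assumed rule: a catch mask matches if it shares any bit with the flags. */
static bool demo_mask_matches(uint32_t flags, uint32_t mask)
{
    return (flags & mask) != 0;
}
```

Under this assumption, a catch with a combined mask such as DEMO_EX_INPUT would accept both DEMO_EX_IO and DEMO_EX_PARSE, while a catch with a single-bit mask accepts only that kind.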

Exceptions thrown by called functions can be caught.

They will propagate up the call stack recursively until there is a matching catch.

/*
 * Tries to parse the given string as an int.
 * Throws EX_INVALID if the string is NULL
 * Throws EX_PARSE if the string could not be parsed as an int
 */
int parseInt(const char *str)
{
    if(!str)
        throw(EX_INVALID, "Invalid input string (%p)", str);
    char *tail;
    errno = 0;
    
    long r = strtol(str, &tail, 10);
    
    if(tail == str)
        throw(EX_PARSE, "Not a number: %s", str);
    if(errno == ERANGE || r > INT_MAX || r < INT_MIN)
        throw(EX_PARSE, "Number out of range: %s", str);
    while(*tail) {
        if(isspace((unsigned char)*tail))
            tail++;
        else
            throw(EX_PARSE, "Extra symbols after number: %s", str);
    }
    return (int)r;
}

try {
    parseInt("Ten");
} catch(EX_PARSE) {
    exceptionPrint();
}

Result:

Exception: 0x00000100 Not a number: Ten
at parseInt(Demo.c:51))

If there is no matching catch, the exception is printed and the program aborts.

{
    int number = 0;
    println("Enter an integer");
    int count = scanf("%d", &number);
    if(count < 1)
        throw(EX_FORMAT, "Input was not an integer");
    println("Your number was: %i", number);
}

Result:

main(Demo.c:74): Enter an integer
wawa
Uncaught: 0x00000f00 Input was not an integer
at main(Demo.c:77))
Aborted (core dumped)

Alternatively, with appropriate input:

main(Demo.c:74): Enter an integer
42
main(Demo.c:78): Your number was: 42

Braces are optional for single-statement blocks.

try
    println("10 == %i", parseInt(" 10 "));
catch(EX_PARSE)
    exceptionPrint();

Result:

main(Demo.c:85): 10 == 10

Throwing an exception can clobber local variables that are modified inside a try block and used after it.

Such variables should be declared volatile:

{
    volatile int number;
    try {
        number = parseInt(" 10 ");
    } catch(EX_PARSE) {
        number = -1;
    }
    if(number == -1)
        println("A problem has occurred");
    else
        println("number == %i", number);
}

Result:

main(Demo.c:103): number == 10
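The clobbering problem comes from setjmp/longjmp itself rather than from the macros. The following sketch uses plain setjmp/longjmp (no part of this project's API; all names are made up for the example) to show the same pattern: a local that is modified after setjmp and read after longjmp is declared volatile.

```c
#include <setjmp.h>

static jmp_buf demo_env;

/* Stands in for a function that throws: jumps back to the setjmp site. */
static void demo_fail(void)
{
    longjmp(demo_env, 1);
}

/* Returns the value of a local modified between setjmp and longjmp. */
static int demo_volatile_local(void)
{
    /* volatile: modified after setjmp and read after longjmp. Without it,
     * the C standard leaves the value indeterminate at that point. */
    volatile int number = 0;

    if (setjmp(demo_env) == 0) {
        number = 10;   /* modified inside the "try" part */
        demo_fail();   /* never returns normally */
    }
    /* Reached via longjmp; number is reliably 10 because it is volatile. */
    return number;
}
```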

Caught exceptions are consumed, but can be rethrown.

This can be used to perform necessary cleanup, but still propagate the exception.

try {
    char *volatile mem;
    try {
        mem = malloc(1);
        throw(1, "Rethrow demo");

        println("Freeing memory (1)");
        free(mem); //Doesn't happen
    } catch(1) {
        println("Freeing memory (2)");
        free(mem); //Does happen
        rethrow;
    }
} catch(1) {
    exceptionPrint();
}

Result:

main(Demo.c:119): Freeing memory (2)
Exception: 0x00000001 Rethrow demo
at main(Demo.c:114))

Cleanup is usually better done with a finaliser, reducing code duplication and often avoiding the need for volatile.

Finalisers are nested functions that run on most exits of the try block (throws, return statements, normal exits), in reverse order.

Note that finalisers can only be used inside a try block.

try {
    char *mem = malloc(1);
    finally(free(mem));
    finally(println("Freeing memory"));
    
    throw(1, "Finalizer demo");
} catch(1) {
    exceptionPrint();
}

Result:

_onFinallyFunc(Demo.c:136): Freeing memory
Exception: 0x00000001 Finalizer demo
at main(Demo.c:138))
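The reverse-order behaviour can be pictured as a small stack of queued actions. The sketch below is not how this project implements it (it uses nested functions); it substitutes plain function pointers with a context argument, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FINALISERS 8

typedef void (*DemoFinaliser)(void *arg);

typedef struct {
    DemoFinaliser fn[DEMO_MAX_FINALISERS];
    void *arg[DEMO_MAX_FINALISERS];
    size_t count;
} DemoFinaliserStack;

/* Queue a cleanup action; later registrations run first. */
static void demo_finally(DemoFinaliserStack *s, DemoFinaliser fn, void *arg)
{
    assert(s->count < DEMO_MAX_FINALISERS);
    s->fn[s->count] = fn;
    s->arg[s->count] = arg;
    s->count++;
}

/* Run all queued actions in reverse order, each exactly once. */
static void demo_run_finalisers(DemoFinaliserStack *s)
{
    while (s->count > 0) {
        s->count--;
        s->fn[s->count](s->arg[s->count]);
    }
}

/* Example finaliser: appends one character to a log, to record the order. */
static char demo_log[DEMO_MAX_FINALISERS + 1];
static size_t demo_log_len;

static void demo_log_char(void *arg)
{
    assert(demo_log_len < DEMO_MAX_FINALISERS);
    demo_log[demo_log_len++] = *(const char *)arg;
    demo_log[demo_log_len] = '\0';
}
```

Queuing demo_log_char with "A" and then with "B" and running the stack leaves "BA" in the log, which mirrors how the println finaliser above runs before free(mem) would.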

Special finalisers can be added which only run when an exception is thrown.

In this synthetic example the function releases an allocated string upon failure. On success, the string is returned and freeing it is the responsibility of the caller.

char *makeString(unsigned int num) {
    try {
        char *str = malloc(4);
        if(!str)
            throw(EX_MEMORY, "Failed to allocate str");
        except(println("Freeing str"); free(str));
        
        if(num >= 1000)
            throw(EX_BAD_VALUE, "Number is too big: %u", num);
        sprintf(str, "%u", num);
        return str;
    }
}

try {
    char *str = makeString(700);
    println("Freeing str = %s", str);
    free(str);
} catch(EX_BAD_VALUE)
    exceptionPrint();

try {
    char *str = makeString(9999);
    println("Freeing str = %s", str);
    free(str);
} catch(EX_BAD_VALUE)
    exceptionPrint();

Result:

main(Demo.c:165): Freeing str = 700
_onFinallyFunc(Demo.c:154): Freeing str
Exception: 0x00000200 Number is too big: 9999
at makeString(Demo.c:157))

Exceptions can be chained to make them more informative.

This is used to add context to exceptions thrown by utility functions such as parseInt. It exposes the code which called the function without the need for a debugger. You can also add textual information such as which file is being parsed (or which argument).

try {
    if(argc<2)
        throw(EX_INVALID, "Not enough arguments");
    else
        println("%i", parseInt(argv[1]));
} catch(EX_RECOVERABLE) {
    throwChained(exceptionCurrent()->flags, "Failed to parse argument 1 as an integer");
}

Result:

Uncaught: 0x00000100 Failed to parse argument 1 as an integer
at main(Demo.c:189))
Caused by: 0x00000100 Not enough arguments
at main(Demo.c:185))
Aborted (core dumped)

Alternatively, with appropriate argument (8):

main(Demo.c:187): 8

How

Exceptions are allocated in thread-local storage, so that they can survive the jumps that happen on throw statements without relying on the heap. To facilitate chaining, a circular buffer with room for 16 simultaneous exceptions is allocated. This structure also contains a linked list (stack) of try-block states, and is referred to as the exception stack.
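A minimal sketch of what such storage could look like, assuming GCC's __thread keyword and a fixed-size message buffer. The struct layout, field names and message size are illustrative guesses, not the project's actual definitions.

```c
#include <stdio.h>

#define DEMO_RING_SIZE 16   /* matches the 16 slots mentioned above */

typedef struct {
    unsigned flags;
    char message[64];       /* assumed size, for illustration only */
} DemoException;

typedef struct {
    DemoException ring[DEMO_RING_SIZE];
    unsigned next;          /* total number of exceptions ever stored */
} DemoExceptionStack;

/* One instance per thread; no heap allocation involved. */
static __thread DemoExceptionStack demo_stack;

/* Store an exception in the next slot, overwriting the oldest when full. */
static DemoException *demo_store(unsigned flags, const char *message)
{
    DemoException *e = &demo_stack.ring[demo_stack.next % DEMO_RING_SIZE];
    demo_stack.next++;
    e->flags = flags;
    snprintf(e->message, sizeof e->message, "%s", message);
    return e;
}
```

Because the storage is static per thread, a pointer returned by demo_store stays valid across a longjmp, but its contents are overwritten once 16 further exceptions have been stored.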

try
    throw(EX_INVALID, "Foo");
catch(EX_INVALID)
    exceptionPrint();

Expands to (formatted):

for (__attribute__((cleanup(exceptionLeaveTry))) struct TryData _try = exceptionBeginTry();; exceptionExitTry(&_try))
  if(_try.phase == 2)
    break;
  else if(exceptionCheckTry(&_try, setjmp(_try.env)))
    while(true) exceptionThrow((CodeLoc) { __FUNCTION__, __FILE__, __LINE__ }, false, EX_INVALID, NULL, "Foo");
  else if(_try.uncaught && exceptionCheckTypes(&_try, EX_INVALID))
    exceptionPrint();

From the top:

A for loop is used for macro expansion reasons. Notably, the try construct has to declare some state in its own scope, and the for loop is the only control structure that can do this without curly braces.
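The for loop trick can be shown in isolation. In this sketch a hypothetical macro DEMO_SCOPED declares a variable that is visible only to the single statement following it, with no braces required; it is a simplified stand-in, not the macro used by this project.

```c
/* Hypothetical macro: runs the following statement exactly once, with
 * `_depth` declared in a scope that ends together with that statement. */
#define DEMO_SCOPED(init) \
    for (int _depth = (init), _done = 0; !_done; _done = 1)

static int demo_scoped(void)
{
    int result = 0;
    DEMO_SCOPED(41)
        result = _depth + 1;   /* single statement, no braces needed */
    /* _depth is no longer in scope here. */
    return result;
}
```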

__attribute__((cleanup(exceptionLeaveTry))) struct TryData _try = exceptionBeginTry();;

This declares and initialises the state data structure used by the try construct. The GCC-specific cleanup attribute makes the exceptionLeaveTry function run when this variable goes out of scope. This function runs the finalisers if they haven't run already; this is the only way to make them run on return statements. It also pops _try off the exception stack.
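The effect of the cleanup attribute on return statements can be seen in a small stand-alone sketch. The function and variable names here are invented for the example; only the attribute itself is the GCC feature described above.

```c
static int demo_cleanup_count;

/* Called with a pointer to the variable when it goes out of scope. */
static void demo_leave(int *state)
{
    demo_cleanup_count += *state;
}

static int demo_early_return(int flag)
{
    __attribute__((cleanup(demo_leave))) int state = 1;
    if (flag)
        return 1;    /* demo_leave(&state) still runs here */
    state = 2;
    return 0;        /* and here, seeing the updated value */
}
```

Calling demo_early_return(1) adds 1 to demo_cleanup_count and demo_early_return(0) adds 2, showing that the handler runs on both exits and sees the variable's final value.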

The for loop condition is empty (it cannot terminate the loop).

[...] exceptionExitTry(&_try))
  if(_try.phase == 2)
    break;

The loop's increment expression (exceptionExitTry) runs at most once. It runs the finalisers if they haven't run already and sets _try.phase to 2. If the exception stack has an uncaught exception, it rethrows it to the next try block up the stack (and runs its finalisers). If control enters the loop body for a second iteration, it breaks immediately.

  else if(exceptionCheckTry(&_try, setjmp(_try.env)))

The call setjmp(_try.env) sets the jump target for throws within this try block to this line. On this first call it returns 0, and exceptionCheckTry reacts by pushing _try onto the exception stack, setting _try.phase to 0, and returning true, so that the try block body is executed.

    while(true) exceptionThrow((CodeLoc) { __FUNCTION__, __FILE__, __LINE__ }, false, EX_INVALID, NULL, "Foo");

This is the try block body, expanded from the throw. It is written as an infinite loop to convince development tools that this statement does not terminate normally. (It doesn't actually loop.) The arguments are information necessary to construct the exception, including information about the current line of code.
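Capturing the location works because the macro expands at the call site. A minimal sketch, assuming a struct shaped like the CodeLoc seen in the expansion (the names DemoCodeLoc, DEMO_HERE and demo_format_loc are hypothetical):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counterpart of the CodeLoc struct seen in the expansion. */
typedef struct {
    const char *function;
    const char *file;
    int line;
} DemoCodeLoc;

/* Expands where it is used, so it records the location of that use. */
#define DEMO_HERE ((DemoCodeLoc) { __func__, __FILE__, __LINE__ })

static DemoCodeLoc demo_where(void)
{
    return DEMO_HERE;
}

/* Formats a location as function(file:line); returns snprintf's result. */
static int demo_format_loc(DemoCodeLoc loc, char *buf, size_t size)
{
    return snprintf(buf, size, "%s(%s:%d)", loc.function, loc.file, loc.line);
}
```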

The exceptionThrow function determines which try block to jump to (the current block if inside the try body, the outer try block otherwise), and runs its finalisers if they haven't run already, then constructs the exception on the exception stack and performs the longjmp.

Control now returns to the line with setjmp, which this time returns 1. exceptionCheckTry responds by setting _try.phase to 1 and returning false, so control continues to the catch clause(s):

  else if(_try.uncaught && exceptionCheckTypes(&_try, EX_INVALID))
    exceptionPrint();

Since there is an uncaught exception and its flags match EX_INVALID, the catch block runs, which simply prints information about this exception.

Control then falls through to exceptionExitTry and the second for loop iteration. It terminates normally.

If there had been no matching catch clause for the thrown exception, control would have fallen through to exceptionExitTry, which would have detected the uncaught exception and thrown (longjmp'd) to the outer try block. (If there is no outer try block, it instead prints the exception and aborts the program.)

If the try block body had not thrown, control would have fallen through to exceptionExitTry and the second for loop iteration, terminating normally.

If there had been a throw in the catch body, this would have thrown to the outer try block.
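Stripped of macros, finalisers and phases, the core control flow is a stack of jump targets. The following hand-written sketch is a simplification under that assumption, with hypothetical names throughout; it handles a single try level explicitly and asserts instead of aborting with a message when there is no enclosing try.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

typedef struct DemoTry {
    jmp_buf env;
    struct DemoTry *outer;             /* next try block up the stack */
} DemoTry;

static __thread DemoTry *demo_top;     /* innermost active try block */
static __thread unsigned demo_thrown;  /* flags of the pending exception */

/* Jump to the innermost try block. */
static void demo_throw(unsigned flags)
{
    DemoTry *target = demo_top;
    assert(target != NULL);            /* a real version would print and abort */
    demo_top = target->outer;          /* a throw leaves the try block */
    demo_thrown = flags;
    longjmp(target->env, 1);
}

/* A called function that may throw. */
static void demo_inner(int fail)
{
    if (fail)
        demo_throw(2);
}

/* Returns 0 if the body completed, otherwise the caught flags. */
static unsigned demo_try_catch(int fail)
{
    DemoTry t;
    t.outer = demo_top;
    if (setjmp(t.env) == 0) {
        demo_top = &t;                 /* push */
        demo_inner(fail);              /* try body */
        demo_top = t.outer;            /* pop on normal exit */
        return 0;
    }
    /* Arrived via longjmp: this is the catch part. */
    return demo_thrown;
}
```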

There is a final and troublesome case: when a finaliser throws an exception.

The first problem is that this would stop any remaining finalisers from running, which could cause resource leaks. For this reason, exceptions in finalisers are caught and suppressed.

If finalisers are being run as a result of an exception being thrown, then that exception is still thrown, and the last suppressed exception can be accessed at exceptionStack->suppressed. Otherwise, the last suppressed exception is thrown as if it happened inside the try body.

Exceptions that happen in finalisers are treated as low priority and can be lost. If they are chained, only the topmost element is remembered.

Exception chains are stored consecutively in the circular buffer. It is possible for this buffer to overflow, which leads to loss of the deepest elements. In particular, poorly behaved finalisers can populate this buffer with elements unrelated to the current exception. Each chain has its own sequence number to enable detection of unrelated elements.
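One way to picture the role of sequence numbers is a backwards walk over the buffer that stops at the first element from a different chain. This is only a guess at the idea, with a deliberately small buffer and hypothetical names; the real data layout may differ.

```c
#define DEMO_SLOTS 4   /* small on purpose; the text above mentions 16 */

typedef struct {
    unsigned sequence;   /* identifies the chain this element belongs to */
    unsigned flags;
} DemoElement;

typedef struct {
    DemoElement slots[DEMO_SLOTS];
    unsigned head;       /* number of elements ever pushed */
} DemoRing;

/* Append an element, overwriting the oldest one when the buffer is full. */
static void demo_push(DemoRing *r, unsigned sequence, unsigned flags)
{
    DemoElement *e = &r->slots[r->head % DEMO_SLOTS];
    e->sequence = sequence;
    e->flags = flags;
    r->head++;
}

/* Count how many of the most recent elements belong to the given chain,
 * walking backwards from the newest and stopping at the first stranger. */
static unsigned demo_chain_length(const DemoRing *r, unsigned sequence)
{
    unsigned n = 0;
    while (n < DEMO_SLOTS && n < r->head) {
        const DemoElement *e = &r->slots[(r->head - 1 - n) % DEMO_SLOTS];
        if (e->sequence != sequence)
            break;
        n++;
    }
    return n;
}
```

In this sketch a chain longer than the buffer is reported with at most DEMO_SLOTS elements, which corresponds to losing the deepest elements on overflow.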

The thread-local storage is normally implemented using compiler support for thread-local variables, but this is not available on Windows 98, so the Win32 TLS functions (TlsAlloc and friends) are used there instead.


Why

Error conditions are inherent to computer programs: network transmissions can fail, systems can lack required functionality, users can provide invalid input files, perform illegal sequences of actions, or disconnect hardware at inopportune moments, storage media can run out of space or become corrupt, and so on. Besides, programmers are not perfect and mistakes are made.

It is important to be able to detect and handle errors in a computer program:

  • Programmers need to know as much as possible about what went wrong in order to improve the program
  • Users need descriptive error messages to understand how to deal with errors
  • Some types of errors require cleanup actions to take place to avoid leaving the system in a corrupted state
  • Programs sometimes need to recover from errors and continue normal operation

The typical way of managing errors in C is using error return codes. This concept is easy to understand but has many weaknesses:

  1. This consumes the return value, making it harder for functions to return actual values.
  2. Error codes have limited descriptive power. They cannot include information such as file names, nor sufficient information about where in the program the error originated.
  3. Users cannot read error codes, and translating them to text for presentation is a hassle.
  4. Different libraries use different error code lists, which overlap with each other. A function that propagates errors originating from different libraries must perform error code translation.
  5. Error returns are viral: All calls to functions that may be used incorrectly or fail in some other way should be accompanied by an error check and error handling. Errors that are not dealt with immediately need to be manually propagated, consuming the return value of the outer function, and adding yet more error checks/handling to its callers, often recursively.
  6. Performing exact and appropriate cleanup (unlocking mutexes, freeing memory, deleting files, undoing partial changes, etc) is difficult in functions that have multiple failure points.
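A small, made-up example of points 1 and 5: the return value is taken by the error code, so the result moves to an out-parameter, and each caller has to check and forward the code by hand. The functions and error codes are hypothetical.

```c
#include <limits.h>

enum { DEMO_OK = 0, DEMO_ERR_RANGE = 1 };

/* The return value carries the error, so the result needs an out-parameter. */
static int demo_checked_square(int x, int *out)
{
    if (x > 46340 || x < -46340)       /* x * x would not fit in 32 bits */
        return DEMO_ERR_RANGE;
    *out = x * x;
    return DEMO_OK;
}

/* Every caller repeats the check and forwards the code manually. */
static int demo_sum_of_squares(int a, int b, int *out)
{
    int sa, sb, err;

    err = demo_checked_square(a, &sa);
    if (err != DEMO_OK)
        return err;
    err = demo_checked_square(b, &sb);
    if (err != DEMO_OK)
        return err;
    if (sa > INT_MAX - sb)             /* the addition itself can fail too */
        return DEMO_ERR_RANGE;
    *out = sa + sb;
    return DEMO_OK;
}
```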

The problem of consuming the return value can be alleviated by storing the error code using a separate mechanism (e.g. errno), and it is tempting to try to solve the problem of limited descriptive power (and potentially presentation to the user) using logging. However, I have not found these to be satisfactory.

An important insight is that the same error condition can be an actual error in one context and an ordinary circumstance in another. For instance, a function failing to open a file because it does not exist can be fatal or a non-issue depending on what the file is for. This limits the capacity for logging to successfully solve any of the problems of error codes in complex software:

  • The low level code (where the error manifests) often doesn't know what it's being used for or how errors will be handled. Logging output here can cause misleading and unnecessary log spam.
  • The high level code (which defines the meaning) only knows what it can determine from the error codes and its own state. This is not necessarily sufficient for good log output.

(This is not to say logging is a poor strategy, only that it struggles to solve the problems inherent to error codes.)

Exceptions are a different paradigm which solves a greater number of problems:

  1. Exceptions do not consume the return value
  2. Exceptions can contain information about where the error occurred and what caused it
  3. Exceptions can contain error messages to be displayed if the error could not be handled gracefully
  4. Exceptions are less susceptible to collisions between libraries
  5. Exceptions are automatically propagated to the nearest handler, leading to far more concise code
  6. Exceptions can perform automatic and incremental cleanup

Sometimes I yearn for C, but using error codes in larger projects is annoying to do correctly and makes debugging difficult. There are other projects that add exception handling to C, but this one is optimised for my usage.

OwO